Programming MPSoC platforms: Road works ahead
This paper summarizes a special session on multicore/multi-processor system-on-chip (MPSoC) programming challenges. The current trend towards MPSoC platforms in most computing domains means more than a radical change in computer architecture. Even more important from a SW developer's viewpoint, the classical sequential von Neumann programming model needs to be overcome at the same time. Efficient utilization of the MPSoC HW resources demands radically new models and corresponding SW development tools, capable of exploiting the available parallelism and guaranteeing bug-free parallel SW. While several standards are established in the high-performance computing domain (e.g. OpenMP), it is clear that more innovations are required for successful deployment of heterogeneous embedded MPSoC. On the other hand, at least for the coming years, the freedom for disruptive programming technologies is limited by the huge amount of certified sequential code, which demands a more pragmatic, gradual tool and code replacement strategy.
Efficient Exploration of Bus-Based System-on-Chip Architectures
Separation between computation and communication
in system design allows system designers to explore the communication
architecture independently after component selection and
mapping decisions are made. In this paper, we present an iterative
two-step exploration methodology for bus-based on-chip communication
architecture for multitask applications. We assume that
the memory traces from the processing components are given.
The proposed methodology uses a static performance estimation
technique extended for multitask applications to reduce the design
space quickly and drastically and applies a trace-driven simulation
to the reduced set of design candidates for accurate performance
estimation. When local memory traffic as well as
shared memory traffic is involved in bus contention, memory
allocation is treated as an important axis of the design space
in our technique. Experimental results show that the proposed
methodology achieves significant performance gain by optimizing
on-chip communication only, up to almost 100% compared with
an initial single shared bus architecture, in two real-life
examples: a four-channel digital video recorder and an equalizer
for an OFDM DVB-T receiver. This work
was supported by the National Research Laboratory Program under Grant
M1-0104-00-0015 and the IT Leading Research and Development Support
Project funded by Korean MIC.
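The two-step flow described in this abstract (prune the design space with a cheap estimate, then evaluate only the survivors accurately) can be sketched as follows. This is only an illustration under assumptions: `explore`, `toy_estimate`, `toy_simulate`, and `keep` are hypothetical names, and the toy cost functions stand in for the paper's static estimator and trace-driven simulator, whose details the abstract does not give.

```python
from typing import Callable, Iterable, TypeVar

C = TypeVar("C")

def explore(candidates: Iterable[C],
            fast_estimate: Callable[[C], float],
            accurate_simulate: Callable[[C], float],
            keep: int) -> C:
    """Step 1: rank all candidates with a cheap estimate and keep the best few.
    Step 2: run the expensive evaluation only on the survivors."""
    survivors = sorted(candidates, key=fast_estimate)[:keep]
    return min(survivors, key=accurate_simulate)

# Toy stand-ins (assumptions, not the paper's models): a candidate is a
# number of buses, and both functions return an execution-time figure.
def toy_estimate(n_buses: int) -> float:
    return 100.0 / n_buses

def toy_simulate(n_buses: int) -> float:
    return 100.0 / n_buses + 3.0 * n_buses  # adds a per-bus overhead

best = explore(range(1, 9), toy_estimate, toy_simulate, keep=3)
```

With these toy functions the cheap estimate keeps the candidates with 8, 7, and 6 buses, and the more detailed evaluation then selects 6 among them. The expensive function is called only `keep` times, regardless of how many candidates there are.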
Memory Efficient Software Synthesis with Mixed Coding Style from Dataflow Graphs
This paper presents a set of techniques to reduce the code and
data sizes for software synthesis from graphical digital signal-processing
programs based on the synchronous dataflow model. By sharing the
kernel code among multiple instances of a block with a shared function,
we can further reduce the code size below the previous results based on
inline coding style. A systematic approach is also devised that gives up the
single appearance schedule to reduce the data buffer requirement. The
proposed techniques have been evaluated with two real-life examples to
prove their significance. This work was supported by the academic
fund of Ministry of Education, Republic of Korea, through the
Inter-University Semiconductor Research Center, Seoul National University, under
ISRC-98-E-2103.
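The trade-off between a single appearance schedule and buffer size mentioned in this abstract can be illustrated on a two-actor synchronous dataflow edge. The sketch below is a minimal example under assumptions: the actors `A` and `B`, the rates 2 and 3, and the helper names `repetitions` and `max_buffer` are hypothetical and are not taken from the paper.

```python
from math import gcd
from typing import Tuple

def repetitions(produce: int, consume: int) -> Tuple[int, int]:
    """Solve the balance equation q_a * produce == q_b * consume
    for the smallest positive integers (q_a, q_b)."""
    g = gcd(produce, consume)
    return consume // g, produce // g

def max_buffer(schedule: str, produce: int, consume: int) -> int:
    """Peak token count on the edge A -> B for a firing sequence
    written as a string of 'A' and 'B'."""
    tokens = peak = 0
    for actor in schedule:
        if actor == "A":
            tokens += produce
        else:
            if tokens < consume:
                raise ValueError("B fired without enough tokens")
            tokens -= consume
        peak = max(peak, tokens)
    return peak

q_a, q_b = repetitions(2, 3)               # A produces 2, B consumes 3
single_appearance = "A" * q_a + "B" * q_b  # AAABB: each actor appears once
interleaved = "AABAB"                      # same firing counts, interleaved

sas_peak = max_buffer(single_appearance, 2, 3)
interleaved_peak = max_buffer(interleaved, 2, 3)
```

Here the single appearance schedule needs a buffer of 6 tokens, while the interleaved schedule needs only 4. The interleaved schedule mentions each actor more than once, so with inline code generation the actor code would be duplicated; calling a shared function instead avoids that duplication.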
Transformation and VHDL Code Generation from Coarse-grained Dataflow Graph
This paper discusses how we generate VHDL code for DSP applications described in dataflow graphs. Because the generated VHDL code implements the details of the control structure, we can easily transform it into a running circuit without any modifications, using logic synthesis tools. To improve the quality of the synthesized circuit, we apply some graph transformation techniques to the original dataflow graph. We mainly consider coarse-grained dataflow graphs in which each node corresponds to an IP component of considerable size. The proposed facility is very useful for dataflow graph based high level design tools, including our codesign framework PeaCE (Ptolemy extension as Codesign Environment).
Efficient Hardware Controller Synthesis for Synchronous Dataflow Graph in System Level Design
This paper concerns automatic hardware synthesis
from a data flow graph (DFG) specification in system level design. In
the presented design methodology, each node of a data flow graph
represents a hardware library module that contains synthesizable
VHDL code. Our proposed technique automatically synthesizes a
clever control structure, the cascaded counter controller, that supports
asynchronous interaction with outside modules while efficiently
implementing the synchronous dataflow semantics of the graph
at the same time. The novelty of the proposed technique is demonstrated
by comparing it with previous work on several examples. This work
was supported by the National Research Laboratory (NRL) Grant and the Brain
Korea 21 Project. The RIACT at Seoul National University provides research
facilities for this study.
Schedule-Aware Performance Estimation of Communication Architecture for Efficient Design Space Exploration
In this paper, we are concerned with performance estimation
of bus-based communication architectures assuming that
task partitioning and scheduling on processing elements are already
determined. Since communication overhead is dynamic and
unpredictable due to bus contention, a simulation-based approach
seems inevitable for accurate performance estimation. However,
it is too time-consuming to be used for exploring the wide design
space of bus architectures. We propose a static performance-estimation
technique based on a queueing analysis assuming that the
memory traces and the task schedule information are given. We
use this static estimation technique as the first step in our design
space exploration framework to prune the design space drastically
before applying a simulation-based approach to the reduced design
space. Experimental results show that the proposed technique
is several orders of magnitude faster than a trace-driven simulation
while keeping the estimation error within 10% consistently in
various communication architecture configurations. This work was supported by the National Research Laboratory under Program
M1-0104-00-0015, Brain Korea 21 Project, and the IT-SoC project. ICT
at Seoul National University provided research facilities for this study.
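To give a feel for why a queueing-based estimate is so much cheaper than simulation, the sketch below evaluates a closed-form delay for a single shared bus. The abstract does not state which queueing model the authors use; the textbook M/M/1 formula is used here purely as a stand-in, and the function name `mm1_response_time` and the example rates are assumptions.

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time a request spends waiting for and using the bus in an
    M/M/1 queue: W = 1 / (mu - lambda). Rates are in requests per cycle."""
    if not 0 <= arrival_rate < service_rate:
        raise ValueError("need 0 <= arrival_rate < service_rate")
    return 1.0 / (service_rate - arrival_rate)

# Assumed example: a transfer takes 4 cycles on average (service rate 0.25)
# and requests arrive once every 8 cycles on average (arrival rate 0.125).
delay_cycles = mm1_response_time(arrival_rate=0.125, service_rate=0.25)
```

For these assumed rates the estimate is 8 cycles per request, obtained from one arithmetic expression instead of replaying a memory trace. A static estimator of this kind can therefore be applied to every candidate architecture before simulation is used on the few that remain.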